Comparing Lexical Relationships Observed within Japanese Collocation Data and Japanese Word Association Norms

Authors

  • Terry Joyce
  • Irena Srdanović
Abstract

While large-scale corpora and various corpus query tools have long been recognized as essential language resources, the value of word association norms as language resources has been largely overlooked. This paper conducts some initial comparisons of the lexical relationships observed within Japanese collocation data extracted from a large corpus using the Japanese language version of the Sketch Engine (SkE) tool (Srdanović et al., 2008) and the relationships found within Japanese word association sets taken from the large-scale Japanese Word Association Database (JWAD) under ongoing construction by Joyce (2005, 2007). The comparison results indicate that while some relationships are common to both linguistic resources, many lexical relationships are only observed in one resource. These findings suggest that both resources are necessary in order to more adequately cover the diverse range of lexical relationships. Finally, the paper reflects briefly on the implementation of association-based word-search strategies into electronic dictionaries proposed by Zock and Bilac (2004) and Zock (2006).

Similar resources

Coling 2008 22nd International Conference on Computational Linguistics

While large-scale corpora and various corpus query tools have long been recognized as essential language resources, the value of word association norms as language resources has been largely overlooked. This paper conducts some initial comparisons of the lexical relationships observed within Japanese collocation data extracted from a large corpus using the Japanese language version of the Sketc...

Collocations as Word Co-ocurrence Restriction Data - An Application to Japanese Word Processor

Collocations, the combination of specific words, are quite useful linguistic resources for NLP in general. The purpose of this paper is to show their usefulness, exemplifying an application to Kanji character decision processes for Japanese word processors. Unlike recent trials of automatic extraction, our collocations were collected manually through many years of intensive investigation of corp...

Extracting Bilingual Collocations from Non-Aligned Parallel Corpora

This paper proposes a new method to find correspondences of uninterrupted collocations from Japanese-English bilingual corpora without sentence-to-sentence alignment. Uninterrupted collocations in English such as “once again”, “give up”, or “gross national product” handled as a single word or a compound word in Japanese, can be automatically extracted with corresponding Japanese words using wor...

Sense Classification of Verbal Polysemy based-on Bilingual Class/Class Association

In the field of statistical analysis of natural language data, the measure of word/class association has proved to be quite useful for discovering a meaningful sense cluster in an arbitrary level of the thesaurus. In this paper, we apply its idea to the sense classification of Japanese verbal polysemy in case frame acquisition from Japanese-English parallel corpora. Measures of bilingual clas...

Large Scale Collocation Data and Their Application to Japanese Word Processor Technology

Word processors or computers used in Japan employ Japanese input method through keyboard stroke combined with Kana (phonetic) character to Kanji (ideographic, Chinese) character conversion technology. The key factor of Kana-to-Kanji conversion technology is how to raise the accuracy of the conversion through the homophone processing, since we have so many homophonic Kanjis. In this paper, we re...

Publication date: 2008